

Google avoids break-up but must share data with rivals

BBC News

Google had proposed less drastic solutions, such as limiting its revenue-sharing agreements with firms like Apple to make its search engine the default on their devices and browsers. On Tuesday, the company indicated that it viewed the ruling as a victory, and said the rise of artificial intelligence (AI) probably contributed to the outcome. "Today's decision recognizes how much the industry has changed through the advent of AI, which is giving people so many more ways to find information," Google said in a statement after the ruling. "This underlines what we've been saying since this case was filed in 2020: Competition is intense and people can easily choose the services they want," the statement continued. The tech giant had denied wrongdoing since charges were first filed against it in 2020, arguing that its market dominance stems from its search engine being a superior product that consumers simply prefer.


Predicting and Explaining Customer Data Sharing in the Open Banking

de Brito, João B. G., Heldt, Rodrigo, Silveira, Cleo S., Bogaert, Matthias, Bucco, Guilherme B., Luce, Fernando B., Becker, João L., Zabala, Filipe J., Anzanello, Michel J.

arXiv.org Artificial Intelligence

The emergence of Open Banking represents a significant shift in financial data management, influencing financial institutions' market dynamics and marketing strategies. This increased competition creates opportunities and challenges, as institutions manage data inflow to improve products and services while mitigating data outflow that could aid competitors. This study introduces a framework to predict customers' propensity to share data via Open Banking and interprets this behavior through Explanatory Model Analysis (EMA). Using data from a large Brazilian financial institution with approximately 3.2 million customers, a hybrid data balancing strategy incorporating ADASYN and NEARMISS techniques was employed to address the infrequency of data sharing and enhance the training of XGBoost models. These models accurately predicted customer data sharing, achieving 91.39% accuracy for inflow and 91.53% for outflow. The EMA phase combined the Shapley Additive Explanations (SHAP) method with the Classification and Regression Tree (CART) technique, revealing the most influential features on customer decisions. Key features included the number of transactions and purchases in mobile channels, interactions within these channels, and credit-related features, particularly credit card usage across the national banking system. These results highlight the critical role of mobile engagement and credit in driving customer data-sharing behaviors, providing financial institutions with strategic insights to enhance competitiveness and innovation in the Open Banking environment.
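
The methods this abstract names map directly onto widely used open-source tooling. Below is a minimal sketch, assuming the imbalanced-learn, xgboost, and shap libraries, of a hybrid ADASYN-plus-NearMiss balancing step feeding an XGBoost classifier, followed by SHAP attribution; the synthetic data, sampling ratios, and model parameters are illustrative assumptions, not the paper's configuration.

    # Hedged sketch: hybrid balancing + XGBoost + SHAP attribution.
    # Synthetic data; ratios and parameters are illustrative only.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from imblearn.pipeline import Pipeline
    from imblearn.over_sampling import ADASYN
    from imblearn.under_sampling import NearMiss
    from xgboost import XGBClassifier
    import shap

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 10))            # stand-in customer features
    y = (rng.random(5000) < 0.05).astype(int)  # rare "shares data" class

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    pipe = Pipeline([
        ("oversample", ADASYN(sampling_strategy=0.5, random_state=0)),
        ("undersample", NearMiss(version=1)),  # thin the majority class
        ("model", XGBClassifier(eval_metric="logloss")),
    ])
    pipe.fit(X_tr, y_tr)
    print("test accuracy:", pipe.score(X_te, y_te))

    # SHAP values on the fitted booster surface the most influential
    # features; the paper additionally fits a CART on the explanations.
    explainer = shap.TreeExplainer(pipe.named_steps["model"])
    shap_values = explainer.shap_values(X_te)

Keeping the resampling inside the training pipeline, rather than balancing the full dataset up front, ensures that synthetic minority samples never leak into the held-out evaluation data.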


The Impact of Transparency in AI Systems on Users' Data-Sharing Intentions: A Scenario-Based Experiment

Rosenberger, Julian, Kuhlemann, Sophie, Tiefenbeck, Verena, Kraus, Mathias, Zschech, Patrick

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) systems are frequently employed in online services to provide personalized experiences to users based on large collections of data. However, AI systems can be designed in different ways, with black-box AI systems appearing as complex data-processing engines and white-box AI systems appearing as fully transparent data-processors. As such, it is reasonable to assume that these different design choices also affect user perception and thus their willingness to share data. To this end, we conducted a pre-registered, scenario-based online experiment with 240 participants and investigated how transparent and non-transparent data-processing entities influenced data-sharing intentions. Surprisingly, our results revealed no significant difference in willingness to share data across entities, challenging the notion that transparency increases data-sharing willingness. Furthermore, we found that a general attitude of trust towards AI has a significant positive influence, especially in the transparent AI condition, whereas privacy concerns did not significantly affect data-sharing decisions.


Privacy-preserving federated prediction of pain intensity change based on multi-center survey data

Das, Supratim, Rafie, Mahdie, Kammer, Paula, Skou, Søren T., Grønne, Dorte T., Roos, Ewa M., Hajek, André, König, Hans-Helmut, Ullah, Md Shihab, Probul, Niklas, Baumbach, Jan, Baumbach, Linda

arXiv.org Artificial Intelligence

Background: Patient-reported survey data are used to train prognostic models aimed at improving healthcare. However, such data are typically collected at multiple centers and, for privacy reasons, cannot easily be pooled in one central repository, while models trained only on local data are less accurate, robust, and generalizable. We present and apply privacy-preserving federated machine learning techniques for prognostic model building, where local survey data never leave the legally safe harbors of the medical centers. Methods: We used centralized, local, and federated learning techniques on two healthcare datasets (GLA:D data from the five health regions of Denmark and international SHARE data from 27 countries) to predict two different health outcomes. We compared linear regression, random forest regression, and random forest classification models trained on local data with those trained on the entire data in a centralized and in a federated fashion. Results: On GLA:D data, the federated linear regression (R2 0.34, RMSE 18.2) and federated random forest regression (R2 0.34, RMSE 18.3) models outperform their local counterparts (R2 0.32, RMSE 18.6 and R2 0.30, RMSE 18.8, respectively) with statistical significance. We also found that the centralized models (R2 0.34, RMSE 18.2 and R2 0.32, RMSE 18.5, respectively) did not perform significantly better than the federated models. On SHARE, the federated model (accuracy 0.78, AUROC 0.71) and the centralized model (accuracy 0.84, AUROC 0.66) perform significantly better than the local models (accuracy 0.74, AUROC 0.69). Conclusion: Federated learning enables the training of prognostic models from multi-center surveys without compromising privacy and with only minimal or no compromise in model performance.
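
To make the federated setup concrete, here is a minimal sketch of the idea for the linear-regression case: each center fits a model on its own records and shares only fitted parameters, which a coordinator aggregates weighted by sample size. This one-shot averaging on synthetic data is a deliberate simplification; real federated schemes typically iterate the parameter exchange, and the authors' exact protocol may differ.

    # Hedged sketch of federated model building for linear regression:
    # raw records stay at each center; only (coef, intercept, n) is shared.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(42)
    true_w = np.array([2.0, -1.0, 0.5])

    def make_center_data(n):
        """Synthetic stand-in for one center's survey records."""
        X = rng.normal(size=(n, 3))
        y = X @ true_w + rng.normal(scale=1.0, size=n)
        return X, y

    centers = [make_center_data(n) for n in (120, 300, 80, 200, 150)]

    # Each center trains locally and reports only its parameters.
    local_fits = []
    for X, y in centers:
        m = LinearRegression().fit(X, y)
        local_fits.append((m.coef_, m.intercept_, len(y)))

    # Coordinator: sample-size-weighted average of the parameters,
    # a one-shot analogue of federated averaging.
    n_total = sum(n for _, _, n in local_fits)
    fed_coef = sum(c * n for c, _, n in local_fits) / n_total
    fed_intercept = sum(b * n for _, b, n in local_fits) / n_total

    print("federated coefficients:", np.round(fed_coef, 2))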


How amalgamated learning could scale medical AI

#artificialintelligence

AI shows tremendous promise in discovering new patterns buried in mountains of data. Yet some data remains isolated in silos for technical, ethical, and commercial reasons. A promising new AI and machine learning technique called amalgamated learning might help overcome these silos to find new cures for diseases, prevent fraud, and improve industrial equipment. It may also provide a way to construct digital twins from inconsistent forms of data. In an exclusive interview with VentureBeat at the Imec Future Summits conference, Roel Wuyts detailed how amalgamated learning works and how it compares to related techniques such as federated learning and homomorphic encryption.


Gartner reveals top trends in data and analytics for 2022

#artificialintelligence

Gartner has identified three key areas that it believes data and analytics (D&A) companies should build into their strategy in 2022. "This year's top D&A trends represent business, market and technology dynamics that will help organisations anticipate change and transform uncertainty into opportunity, both of which have come under the purview of the D&A leader," says Gartner distinguished research vice president Rita Sallam. Gartner says the rise of adaptive AI systems, enabled by practices such as AI engineering, drives growth and innovation while coping with fluctuations in global markets, and that innovation is needed in particular around data management for AI, automated, active metadata-driven approaches, and data-sharing competencies, all founded on data fabrics. Citing the 'always share data' trend, Gartner says treating data sharing as a business-facing key performance indicator demonstrates that an organisation is achieving effective stakeholder engagement and increasing access to the right data to generate public value.


Getting value from your data shouldn't be this hard

MIT Technology Review

The potential impact of the ongoing worldwide data explosion continues to excite the imagination. A 2018 report estimated that, on average, every person produces 1.7 MB of data every second, and annual data creation has more than doubled since then and is projected to more than double again by 2025. A report from McKinsey Global Institute estimates that skillful use of big data could generate an additional $3 trillion in economic activity, enabling applications as diverse as self-driving cars, personalized health care, and traceable food supply chains. But adding all this data to the system is also creating confusion about how to find it, use it, manage it, and share it legally, securely, and efficiently. Where did a certain dataset come from?


Taking personalisation to the next level

#artificialintelligence

Identifying how much personalization to offer, and to whom, will separate winners from losers. Hyper-personalization is one of three areas we focus on as part of the digital consumption cross-industry theme; the other two are the shift from products and services to experiences, and from ownership to access. More than 70% of customers now expect more personalized experiences with the brands they interact with, and digital technology is enabling companies to meet these expectations by delivering personalization to large numbers of customers at low cost. Spectacular advances in artificial intelligence (AI) and software intelligence are enabling companies to take personalization to the next level, making products and services highly relevant to a very large number of customers at the same time.


"Data Trusts" Could Be the Key to Better AI

#artificialintelligence

One of the challenges in developing AI applications is obtaining the vast amount of data that's required. Making matters worse, regulations and privacy issues pose obstacles to firms' sharing their data. A possible solution is for firms to form a "data trust." Willis Towers Watson recently piloted a data trust together with several of its clients. This article shares what they learned about how to create such a trust.


"Data Trusts" Could Be the Key to Better AI

#artificialintelligence

One of the greatest barriers to adopting and scaling AI applications is the scarcity of varied, high-quality raw data. To overcome it, firms need to share their data, but the many regulatory restrictions and ethical issues surrounding data privacy pose a major obstacle to doing so. A novel solution that my firm is piloting, which could solve this problem, is a data trust: an independent organization that serves as a fiduciary for the data providers and governs the proper use of their data. Research shows that companies are becoming increasingly aware of the value of sharing data and are exploring ways to do so with other players in their industry or across industries.